The invention relates to an encoding method for the compression of a video sequence
by means of a tridimensional wavelet transform. This method is based on a hierarchical subband
encoding process leading to transform coefficients constituting a hierarchical pyramid. A
spatio-temporal orientation tree, in which the roots are formed with the pixels of the approximation
subband and the offspring of each of these pixels is formed with the pixels of the higher subbands,
defines the spatio-temporal relationship inside said pyramid. According to the invention, the initial
subband structure of the wavelet transform is preserved, in the encoding process, by scanning
the subbands one after the other in an order that respects the parent-offspring dependencies
formed in the tree. Moreover, flags " off / on " are added to each coefficient of the tree in view
of a progressive transmission of the most significant bits of the coefficients, at least one of them
describing the state of a set of pixels, and at least another one describing the state of a single pixel.

FIELD OF THE INVENTION

The present invention relates to an encoding method for the compression of a
video sequence divided in groups of frames decomposed by means of a tridimensional (3D)
wavelet transform leading to a given number of successive resolution levels, said method
being based on a hierarchical subband encoding process leading from the original set of
picture elements (pixels) of each group of frames to transform coefficients constituting a
hierarchical pyramid, a spatio-temporal orientation tree - in which the roots are formed with
the pixels of the approximation subband resulting from the 3D wavelet transform and the
offspring of each of these pixels is formed with the pixels of the higher subbands
corresponding to the image volume defined by these root pixels - defining the
spatio-temporal relationship inside said hierarchical pyramid.

BACKGROUND OF THE INVENTION 

The expansion of multimedia applications is now making scalability one
of the most important functionalities of video compression schemes. Scalability allows
delivering multiple levels of quality or spatial resolutions/frame rates in an embedded
bitstream towards receivers with different requirements and decoding capabilities. Current
standards like MPEG-4 have implemented scalability in a predictive DCT-based framework
through additional high-cost layers. More efficient solutions based on a tridimensional
wavelet decomposition followed by a hierarchical encoding of the spatio-temporal trees
like the Set Partitioning In Hierarchical Trees algorithm (SPIHT) have been recently
proposed as an extension of still image coding techniques (the SPIHT algorithm is
described for instance in "A new, fast, and efficient image codec based on set partitioning
in hierarchical trees", by A. Said and W.A. Pearlman, IEEE Transactions on Circuits and
Systems for Video Technology, vol.6, n°3, June 1996, pp.243-250). The 3D wavelet
decomposition provides a natural spatial resolution and frame rate scalability, while the
in-depth scanning of the coefficients in the hierarchical trees and the bitplane encoding
lead to the desired quality scalability with a high compression ratio.

The SPIHT algorithm is based on a key concept : the prediction of the
absence of significant information across scales of the wavelet decomposition by
exploiting the self-similarity inherent in natural images. This means that if a coefficient is
insignificant at the lowest scale of the wavelet decomposition, the coefficients
corresponding to the same area at the other scales have a high probability of being
insignificant too. Basically, SPIHT is an iterative algorithm that consists in comparing
a set of pixels corresponding to the same image area at different resolutions with a value
called "level of significance", from the maximal significance level found in the
spatio-temporal decomposition tree down to 0. For a given level, or bitplane, there are two
passes : the sorting pass, which looks for zero-trees or sub-trees and sorts insignificant
and significant coefficients, and the refinement pass, which sends the precision bits of the
significant coefficients.

The SPIHT algorithm examines the wavelet coefficients from the highest
level of the decomposition to the lowest. This corresponds to first considering the
coefficients corresponding to important details located in the smallest scale subbands,
with increasing resolution, then examining the smallest coefficients, which correspond to
fine details. This justifies the "hierarchical" designation of the algorithm : the bits are sent
by decreasing importance of the details they represent, and a progressive bitstream is
thus formed.

A tree structure, called spatial (or spatio-temporal in the 3D case)
orientation tree, defines the spatial (and temporal) relationship inside the hierarchical
pyramid of wavelet coefficients. The roots of the trees are formed with the pixels of the
approximation subband at the lowest resolution ("root" subband), while the pixels of the
higher subbands corresponding to the image area (to the image volume, in the 3D case)
defined by the root pixel form the offspring of this pixel. In the 3D SPIHT algorithm, each
pixel of any subband but the leaves has 8 offspring pixels, and each pixel has only one
parent. There is one exception to this rule : in the root case, one pixel out of 8 has no
offspring. The following notations describe the parent-offspring relationship, an
illustration of these dependencies being given in Fig.1 (tridimensional case) :

O(x,y,z) : set of coordinates of the direct offspring of the node (x,y,z) ;

D(x,y,z) : set of coordinates of all descendants of the node (x,y,z) ;

H(x,y,z) : set of coordinates of all spatio-temporal orientation tree roots (nodes in the
highest pyramid level : spatio-temporal approximation subband) ;

L(x,y,z) = D(x,y,z) - O(x,y,z).

The SPIHT algorithm makes use of three lists : the LIS (list of insignificant
sets), the LIP (list of insignificant pixels), and the LSP (list of significant pixels). In all
these lists, each entry is identified by a coordinate (x,y,z). In the LIP and LSP, (x,y,z)
represents a unique coefficient, while in the LIS it represents a set of coefficients D(x,y,z)
or L(x,y,z), which are sub-trees of the spatio-temporal tree. To differentiate between
them, the LIS entry is of type A if it represents D(x,y,z), and of type B if it represents
L(x,y,z). During the first pass (sorting pass), all the pixels of the LIP are tested and those
that become significant are moved to the list LSP. Similarly, the sets of the LIS that
become significant are removed from the list LIS and split into subsets that are placed at
the end of the LIS and will each be examined in turn. The LSP contains the list of
significant pixels to be "refined" : the n-th bit of the coefficient is sent if this one is
significant with respect to the level n.

The SPIHT approach is designed to provide quality scalability associated with
a high compression ratio. However, scalability in temporal or spatial resolutions cannot be
obtained with this coding strategy without modifications. To improve the global
compression rate of the video coding system, it is usually advised to add an arithmetic
encoder to the zero-tree encoding module (algorithms EZW -for Embedded Zerotree
Wavelet- or SPIHT). In other approaches, the arithmetic coding uses pertinent contexts
directly applied to the subbands for lossless image compression. Most of the time, the
hierarchical and arithmetic coding modules are considered separately. To efficiently
combine them in a single coding system, some modifications have to be performed on
the original SPIHT algorithm.
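
The sets O, D and L and the significance test that SPIHT applies to them can be made concrete with a short sketch. It is only an illustration under stated assumptions: the dyadic offspring rule (2x+i, 2y+j, 2z+k) is the usual 3D SPIHT convention and is not spelled out in the present text, the special rule for the root subband is not modelled, and the names `offspring`, `descendants` and `significance` are hypothetical.

```python
from itertools import product

def offspring(node, shape):
    """O(x,y,z) for a non-root node, ASSUMING the dyadic rule (2x+i, 2y+j, 2z+k)."""
    x, y, z = node
    kids = [(2 * x + i, 2 * y + j, 2 * z + k) for i, j, k in product((0, 1), repeat=3)]
    # Keep coordinates inside the volume; drop the node itself so that
    # (0,0,0) cannot map onto itself (the root rule is not modelled here).
    return [c for c in kids if c != node and all(v < s for v, s in zip(c, shape))]

def descendants(node, shape):
    """D(x,y,z): offspring, their offspring, and so on down to the leaves."""
    found, pending = [], offspring(node, shape)
    while pending:
        c = pending.pop()
        found.append(c)
        pending.extend(offspring(c, shape))
    return found

def significance(coeffs, coords, n):
    """Sn(set): 1 if any coefficient of the set has magnitude >= 2**n, else 0."""
    return int(any(abs(coeffs.get(c, 0)) >= (1 << n) for c in coords))

shape = (8, 8, 8)                  # hypothetical 8x8x8 volume of coefficients
node = (1, 1, 1)                   # a non-root node chosen for illustration
O = offspring(node, shape)
D = descendants(node, shape)
L = [c for c in D if c not in O]   # L = D - O
coeffs = {(5, 5, 5): 9}            # every other coefficient is taken as 0
print(len(O), len(D), len(L))      # 8 72 64
print(significance(coeffs, D, 3), significance(coeffs, D, 4), significance(coeffs, O, 3))  # 1 0 0
```

In this toy volume the single non-zero coefficient lies two levels below the node, so D(1,1,1) is significant at level 3 while O(1,1,1) is not, which is the situation in which a type A entry is split.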

To make the arithmetic coding efficient, it is very important to capture all the
information that may have some influence on the current pixel and particularly the
information related to neighbouring pixels. This information is represented by its context.
The in-depth search performed when scanning for zero-trees does not exploit the
redundancy inside subbands and makes it harder to determine a relevant context
for the arithmetic coding. The manipulation of the lists LIS, LIP, LSP, governed by a set
of logical conditions, makes the order of pixel scanning hardly predictable. The pixels
belonging to the same 3D offspring tree but coming from different spatio-temporal
subbands are encoded and put one after the other in the lists, which has the effect of mixing
the pixels of foreign subbands. Thus, the geographic interdependencies between pixels of
the same subband are lost. Moreover, since the spatio-temporal subbands result from
temporal or spatial filtering, the frames are filtered along privileged axes that give the
orientation of the details. This orientation dependency is also lost when the SPIHT
algorithm is applied, because the scanning does not respect the geographic order.

Furthermore, the bits resulting from the examination of the LIS, LIP, LSP and
the signs of the coefficients have quite different statistical properties. The relevant
contexts for one list can be totally different from those for another. For example, as the LIP
represents the set of insignificant pixels, it is reasonable to suppose that if a pixel is
surrounded by insignificant pixels, it has a good chance of being insignificant too, but this
supposition seems bolder for the LSP : it cannot necessarily be deduced that the
refinement bit of an examined pixel is one (resp. zero) if the refinement bits of its
neighbours are ones (resp. zeros) at a certain level of significance.

Faced with the difficulty of adding an entropy coding stage to the SPIHT
algorithm, the documents that relate such an implementation are quite elusive, or even
skeptical about the efficiency of the proposed solutions. Most of the time, the hierarchical
coding methods and the context-based lossless image compression methods are
compared in the case of still pictures. In a previous European patent application filed on
April 4th, 2000, by the applicant under the number 00400932.0 (PHFR000032), an
encoding method that respects the constraints of quality scalability and enhances the
compression rate has been proposed. This method, based on the efficient insertion of a
context-based arithmetic encoder into the 3D SPIHT algorithm, aims at keeping as long
as possible the neighbourhood of pixels inside the lists, but, finally, this goal is only
partially achieved because of the constraints imposed by the parent-offspring relationship
and the use of lists. It appears therefore that the SPIHT encoding strategy is very
efficient to provide a fully quality progressive bitstream with a high compression rate, but
that the hierarchical structure used in said strategy facilitates neither the
insertion of a context-based adaptive arithmetic coding nor the functionality of spatial or
temporal resolution scalability, which is also strongly required by emerging multimedia
applications.

SUMMARY OF THE INVENTION 

It is therefore an object of the invention to propose a new strategy for
encoding the spatio-temporal wavelet coefficients, inspired by the 3D-SPIHT, but which
allows a better context selection while making it possible to obtain a spatial or temporal
resolution scalability in the coding scheme.

To this end, the invention relates to an encoding method such as defined in
the introductory part of the description and which is moreover characterized in that :

(A) in said encoding process, the initial subband structure of the 3D wavelet
transform is preserved by scanning the subbands one after the other in an order that
respects the parent-offspring dependencies formed in said spatio-temporal tree ;

(B) flags " off / on " are added to each coefficient of the spatio-temporal tree in
view of a progressive transmission of the most significant bits of the coefficients, these
flags being such that at least one of them describes the state of a set of pixels and at
least another one describes the state of a single pixel.

Although the use of the lists LIS, LIP and LSP in the original SPIHT algorithm
facilitates the classification task, it is an obstacle to a geographic organization of the
coefficients. By using the present technique, the initial subband structure of the
3D wavelet transform is preserved, and a flag added to each coefficient indicates to
which list LIS, LIP or LSP this coefficient belongs. Thus, the scanning of the lists is
replaced by a subband scanning and a flag interpretation : the hierarchical and logical
organization of the SPIHT is preserved, but instead of moving a coefficient from one list to
another, this is "virtually" done by changing its flag. The interest of this "virtual moving"
is that the order of reading does not depend on the changes performed by the logic of the
SPIHT algorithm, which is particularly interesting for the refinement pass, since the
refinement bits constitute the greatest part of the bitstream.



BRIEF DESCRIPTION OF THE DRAWINGS 



The present invention will now be described, with reference to the
accompanying drawings in which :

- Fig.1 gives examples of parent-offspring dependencies in the 3D case, in
the spatio-temporal orientation tree ;

- Fig.2 illustrates the hierarchy of the subbands in the spatio-temporal tree ;

- Fig.3 shows a spatially-driven scanning of the spatio-temporal tree ;

- Fig.4 depicts a bitstream organization made possible by the ordered 3D
SPIHT ;

- Fig.5 shows a temporally-driven scanning of the spatio-temporal tree, and
Fig.6 depicts the structure of the bitstream obtained with said scanning ;

- Fig.7 illustrates a combination of SNR, spatial and temporal scalabilities
using the spatially-driven scanning strategy ;

- Fig.8 depicts the determination of the context for the last bit marked ;

- Fig.9 shows the hierarchical organization of the bitstream without
resolution flags.

DETAILED DESCRIPTION OF THE INVENTION 

In the considered method, the whole spatio-temporal tree is fully scanned
for each new bitplane. At the end of the first bitplane, all the offspring dependencies of
the 3D volume have been evaluated. This first scanning is therefore quite critical and
must absolutely respect the calculation order of the offspring dependencies described in
Fig.2. According to the invention, the proposed algorithm scans the subbands one after
the other in an order that respects the parent-offspring relationships. Four different flags
can be added to the coefficients of the spatio-temporal tree :

A) two of them describe the state of a set (trees or subtrees) :

- DIRECT_SET_INSIG (or FS1) if D(x,y,z) is still insignificant ;

- UNDIRECT_SET_INSIG (or FS2) if L(x,y,z) is still insignificant.

B) the two other ones describe the state of a single pixel :

- SIG (or FP3) if the current pixel is significant ;

- INSIG (or FP4) if it is not significant, or if its significance is to be analyzed (put by
default to the pixels that are not included in a zero-tree).

The main steps of the algorithm implemented in the present method are :

1. Initialization :

- Put flag FP4 to all the coefficients of the lowest spatio-temporal subband ;

- Put flag FS1 to 7 over 8 coefficients of the lowest spatio-temporal subband.

2. Calculate and output MSL (the maximum significance level found in the spatio-temporal
decomposition tree) ;

3. From n = MSL down to 0, do a full exploration of the spatio-temporal tree (two main
approaches are possible, as described later : spatially-driven resolution scalability, and
temporally-driven resolution scalability), where, for each coefficient (x,y,z) of the
spatio-temporal tree :

a) set significance :

1) if flag FS1 is "on", then output = Sn(D(x,y,z)).
if Sn(D(x,y,z)) == 1, then :

- for each (x',y',z') ∈ O(x,y,z), put flag FP4 ;

- remove flag FS1 from (x,y,z) ;

- if L(x,y,z) ≠ ∅, then put flag FS2.

2) if flag FS2 is "on", then output = Sn(L(x,y,z)).
if Sn(L(x,y,z)) == 1, then :

- for each (x',y',z') ∈ O(x,y,z), put flag FS1 ;

- remove flag FS2 from (x,y,z) ;

b) pixel significance :

1) if flag FP3 is "on", then output = the n-th bit of (x,y,z).

2) if flag FP4 is "on", then output = Sn(x,y,z).
if Sn(x,y,z) == 1, then :

- put flag FP3 "on" ;

- output sign (x,y,z) ;

- remove flag FP4.
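
The steps above can be sketched as a small routine. This is a minimal illustration and not a definitive implementation: the tree is given as a hypothetical `children` mapping instead of real 3D subbands, `scan_order` is assumed to list parents before their offspring, the toy tree is regular (every offspring of a node either has offspring itself or none has), and the output is a plain list of symbols rather than an arithmetic-coded bitstream. The function name `encode` and the symbol labels are illustrative only.

```python
def encode(coeffs, children, roots, scan_order):
    """Flag-driven bitplane scan; returns (list of output symbols, final flags)."""
    def desc(node):                       # D(node): all descendants
        found = []
        for c in children.get(node, []):
            found.append(c)
            found.extend(desc(c))
        return found

    def sig(nodes, n):                    # Sn(set): 1 if some |c| >= 2**n
        return int(any(abs(coeffs[v]) >= (1 << n) for v in nodes))

    # 1. Initialization: FP4 on every root, FS1 on roots that have offspring.
    flags = {v: set() for v in scan_order}
    for r in roots:
        flags[r].add("FP4")
        if children.get(r):
            flags[r].add("FS1")

    # 2. Maximum significance level.
    msl = max(abs(c) for c in coeffs.values()).bit_length() - 1
    out = [("MSL", msl)]

    # 3. One full scan of the tree per bitplane.
    for n in range(msl, -1, -1):
        for v in scan_order:
            f, kids = flags[v], children.get(v, [])
            d = desc(v)
            indirect = [u for u in d if u not in kids]   # L = D - O
            # (a) set significance
            if "FS1" in f:
                s = sig(d, n)
                out.append(("set", s))
                if s:
                    for k in kids:
                        flags[k].add("FP4")
                    f.discard("FS1")
                    if indirect:
                        f.add("FS2")
            if "FS2" in f:
                s = sig(indirect, n)
                out.append(("set", s))
                if s:
                    for k in kids:
                        flags[k].add("FS1")
                    f.discard("FS2")
            # (b) pixel significance: refinement bit, or significance test
            if "FP3" in f:
                out.append(("ref", (abs(coeffs[v]) >> n) & 1))
            elif "FP4" in f:
                s = sig([v], n)
                out.append(("pix", s))
                if s:
                    f.add("FP3")
                    out.append(("sign", 0 if coeffs[v] >= 0 else 1))
                    f.discard("FP4")
    return out, flags

# Toy tree: one root, two offspring, four leaves (values are illustrative).
children = {"r": ["a", "b"], "a": ["a0", "a1"], "b": ["b0", "b1"]}
coeffs = {"r": 9, "a": 2, "b": -5, "a0": 0, "a1": 1, "b0": 0, "b1": 0}
scan_order = ["r", "a", "b", "a0", "a1", "b0", "b1"]
out, flags = encode(coeffs, children, ["r"], scan_order)
print(out[0], sorted(v for v in scan_order if "FP3" in flags[v]))
```

Note that the coefficients are always visited in `scan_order`; only their flags change, which is the "virtual moving" between lists described in the summary.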

The frames are filtered along privileged axes (spatial or temporal) that give
the orientations of the details. These orientations can be better taken into account by
scanning the subband along the same directions. Using this algorithm, there are then
two main ways of exploring the spatio-temporal volume of coefficients depending on
the chosen privileged orientation, which may be either the spatial or the
temporal axis. Consequently, two types of "multi-scalable" bitstreams may be
obtained, one led by the spatial resolution, the second by the temporal resolution.

(A) spatially-driven resolution scalability : 

For each bitplane, the tree scanning is spatially oriented, since in this
scheme the spatial resolutions are fully explored one after the other as shown in Fig.3.
Inside each spatial scale, all the temporal resolutions are successively scanned. In
other words, the temporal frequency is higher than the spatial one. In order to have
the possibility to skip some part of the bitstream, it is necessary to introduce
resolution flags in the bitstream. The scanning strategy leads to a video bitstream
organized as indicated in Fig.4, where the notations are the same as in Fig.2.

(B) temporally-driven resolution scalability : 

For each bitplane, the tree scanning is temporally oriented, since in this
scheme the temporal resolutions are fully explored one after the other as shown in
Fig.5. Inside each temporal scale, all the spatial resolutions are successively scanned
and therefore the spatial frequencies are available. This scanning strategy leads to
a video bitstream organized as indicated in Fig.6, with the notations defined in Fig.2.
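
The difference between the two scanning strategies amounts to swapping the nesting of two loops within each bitplane. The sketch below only illustrates that ordering: the function name `scan_schedule`, the (s, t) pairs standing for groups of subbands, and the "FLAG" marker standing for a resolution flag are hypothetical, and the actual subband layout of Fig.2 is not reproduced.

```python
def scan_schedule(n_spatial, n_temporal, driven_by="spatial"):
    """Order in which (spatial level s, temporal level t) groups are scanned
    within one bitplane, with a resolution flag between two outer scales."""
    spatial = driven_by == "spatial"
    outer, inner = (n_spatial, n_temporal) if spatial else (n_temporal, n_spatial)
    order = []
    for o in range(1, outer + 1):
        if o > 1:
            order.append("FLAG")          # separator allowing the decoder to jump
        for i in range(1, inner + 1):
            order.append((o, i) if spatial else (i, o))
    return order

print(scan_schedule(2, 2, "spatial"))    # [(1, 1), (1, 2), 'FLAG', (2, 1), (2, 2)]
print(scan_schedule(2, 2, "temporal"))   # [(1, 1), (2, 1), 'FLAG', (1, 2), (2, 2)]
```

In the spatially-driven case the temporal index runs fastest, so a decoder that stops at a flag keeps every temporal resolution of the spatial scales already received; the temporally-driven case is the mirror image.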

In both cases, the three types of scalability (temporal, spatial resolution,
SNR) are obtained :

- SNR scalability is still available since the spatio-temporal scanning is inserted in a
bitplane iterative loop ;

- temporal and spatial scalability are provided respectively with t_max possible frame
rates and s_max possible display sizes ;

- t = 1 corresponds to the minimum frame rate rate_min, t = 2 to 2*rate_min, etc ;

- s = 1 corresponds to the minimum display size (S^x_min, S^y_min), s = 2 to
(2*S^x_min, 2*S^y_min), etc.

An example of selective decoding is illustrated in Fig.7. 

In the European patent application previously cited, four different models to
encode properly the bits issued from the SPIHT have been distinguished. This
distinction has shown promising results and is kept in this scheme. However, this time
the models for significant and insignificant pixels are not differentiated. Both of them
can be grouped into a single statistical model for the refinement pass. Thus three
models are considered, for :

- the bits coding the sets or the subsets (marked by a flag FS1 or FS2),

- the refinement bits (pixel marked with the flag FP3 or FP4),

- the sign bits.

The advantages of the implementation of the method according to the 
invention are the following : 

(A) improvement of contexts : 

Thanks to the fixed subband scanning and the recognition of the flags, it is
possible to reestablish a coherent geographic context for each model. It is particularly
interesting for the coding of the significant pixels and their refinement bits. Indeed,
the SPIHT aims at reducing the redundancy between subbands of different scales, but
it does not really take into account the geographic redundancy, unlike the
context-based coding approaches. For the significant pixels, thanks to the algorithm proposed,
the same efficiency can be reached. The rules of construction of the context are quite
simple and an example is given in the following paragraph, illustrating the context
selection for the refinement bits.

For a considered bitplane n, if a coefficient does not own the flag FP3, it
means that all the bits until the n-th are null. Using this remark, the context is
composed of 0's or 1's from the coefficients marked by FP3 and 0's from the ones
having another flag or no flag. For example, in Fig.8, the bits of context used to code
the last marked bit are shadowed. The contexts are the same on both encoder and
decoder sides, so the probabilities are estimated correctly at both sides. This method
better exploits the neighbouring influence on the current pixel than those which
combine the classical SPIHT algorithm and entropy coding. This method leads to a
"natural" context, directly issued from the transformed image, in conformity with the
bitplane approach, and not from the bits resulting from the original SPIHT algorithm in
the refinement passes. It should improve the compression rate, as the context is really
related to the bit being encoded. However, as it scans all the subbands entirely, the
computation time for the first levels is greater than with the former method.
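
The context rule for the refinement bits described above can be sketched as follows. This is an illustration under explicit assumptions: Fig.8 is not reproduced here, so the choice of three previously scanned neighbours (one step back along x, y and z) is hypothetical, as are the function name `refinement_context` and the packing of the context bits into an integer.

```python
# ASSUMPTION: causal neighbours taken one step back along each axis.
NEIGHBOURS = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))

def refinement_context(coeffs, has_fp3, pos, n, neighbours=NEIGHBOURS):
    """Context for bitplane n at `pos`: bit n of each neighbour marked FP3,
    and 0 for neighbours carrying another flag, no flag, or lying outside."""
    ctx = 0
    for dx, dy, dz in neighbours:
        q = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        if q in coeffs and has_fp3.get(q, False):
            bit = (abs(coeffs[q]) >> n) & 1
        else:
            bit = 0
        ctx = (ctx << 1) | bit            # pack the context bits into an index
    return ctx

coeffs = {(0, 1, 1): 6, (1, 0, 1): 4, (1, 1, 0): 2, (1, 1, 1): 7}
has_fp3 = {(0, 1, 1): True, (1, 0, 1): False, (1, 1, 0): True}
print(refinement_context(coeffs, has_fp3, (1, 1, 1), 1))   # 5, i.e. bits 1, 0, 1
```

The resulting index would select one probability model of the adaptive arithmetic coder; since it depends only on coefficients already scanned and on their flags, the decoder can rebuild the same value.
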
(B) trade-off between multi-scalability and bitstream overload :
The possibility to reconstruct video sequences with the desired frame rate
and display sizes by extracting the corresponding fragments of the bitstream is a very
attractive concept, but it is obtained at the expense of coding efficiency for two main
reasons :

(a) the bitstream fragments related to a particular spatial or temporal resolution need 
to be separated by a flag to make jumps possible. With the two scalability 
schemes described above, on the examples given, at least four separators are 
needed per bitplane and up to 12 bitplanes are currently used to encode the 
wavelet coefficients. 

(b) the context calculation of the adaptive arithmetic coding module must be
reinitialized at the beginning of each new bitplane/spatial resolution/temporal
resolution to ensure that any bitstream fragment will be processed at the decoder
side in exactly the same conditions as at the encoder side. Therefore the
multiplication of separators will unavoidably reduce the length of the consecutive
bit sequences encoded by the arithmetic coding module and make the
probability estimation harder. However, as the subbands can be considered as non or
partially stationary sources, this apparent drawback could turn out to be an advantage.

A trade-off must be found between full resolution scalability and arithmetic
coding efficiency. To this end, an intermediate solution, which provides four levels of
spatial and temporal scalabilities, is proposed. The minimal frame rate rate_min is always
associated with the minimal display size (S^x_min, S^y_min), to constitute the 1st resolution
level. Likewise, 2*rate_min is combined with the display size (2*S^x_min, 2*S^y_min), etc. Fig.9
illustrates this when there are four resolution levels in the decomposition of the group
of frames (GOF). All the combinations that were previously possible (16 possibilities
with 4 spatial levels and four temporal levels) are now restricted to four.
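
The reduction from 16 combinations to four can be checked with a short enumeration. It is a sketch under stated assumptions: each level is taken to double the previous one (the text gives only the first two levels followed by "etc."), and the values 3.75 frames per second and 44x36 pixels are purely illustrative, as is the function name `operating_points`.

```python
def operating_points(levels, rate_min, size_min, coupled=False):
    """List (frame rate, display size) pairs; with coupled=True the temporal
    level t and the spatial level s are forced to be equal."""
    points = []
    for t in range(1, levels + 1):
        for s in range(1, levels + 1):
            if coupled and s != t:
                continue
            rate = rate_min * 2 ** (t - 1)              # ASSUMED dyadic progression
            size = (size_min[0] * 2 ** (s - 1), size_min[1] * 2 ** (s - 1))
            points.append((rate, size))
    return points

full = operating_points(4, 3.75, (44, 36))
restricted = operating_points(4, 3.75, (44, 36), coupled=True)
print(len(full), len(restricted))   # 16 4
print(restricted[-1])               # (30.0, (352, 288))
```

Coupling the two indices removes the separators needed between independent spatial and temporal fragments, which is what lengthens the bit sequences seen by the arithmetic coder.
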
1. An encoding method for the compression of a video sequence divided in groups
of frames decomposed by means of a tridimensional (3D) wavelet transform leading to a
given number of successive resolution levels, said method being based on a hierarchical
subband encoding process leading from the original set of picture elements (pixels) of
each group of frames to transform coefficients constituting a hierarchical pyramid, a
spatio-temporal orientation tree - in which the roots are formed with the pixels of the
approximation subband resulting from the 3D wavelet transform and the offspring of each
of these pixels is formed with the pixels of the higher subbands corresponding to the
image volume defined by these root pixels - defining the spatio-temporal relationship
inside said hierarchical pyramid, said method being further characterized in that :

(A) in said encoding process, the initial subband structure of the 3D wavelet
transform is preserved by scanning the subbands one after the other in an order that
respects the parent-offspring dependencies formed in said spatio-temporal tree ;

(B) flags " off / on " are added to each coefficient of the spatio-temporal tree
in view of a progressive transmission of the most significant bits of the coefficients,
these flags being such that at least one of them describes the state of a set of pixels
and at least another one describes the state of a single pixel.

2. An encoding method according to claim 1, characterized in that, for each
bitplane, the tree scanning is spatially oriented, all the temporal resolutions being
successively scanned inside each spatial scale and resolution flags being introduced
between any two spatial scales.

3. An encoding method according to claim 1, characterized in that, for each
bitplane, the tree scanning is temporally oriented, all the spatial resolutions being
successively scanned inside each temporal scale and resolution flags being introduced
between any two temporal scales.

4. An encoding method according to claim 1, characterized in that, for each
bitplane, an intermediate tree scanning is performed, all the temporal and spatial
resolutions of the same scale being jointly scanned and resolution flags being
introduced between any two spatial/temporal scales.

5. An encoding method according to any one of claims 2 to 4, characterized in
that two flags describe the state of a set of pixels and are, for each coefficient (x,y,z)
of said spatio-temporal tree :

- FS1 if D(x,y,z) is still insignificant ;

- FS2 if L(x,y,z) is still insignificant ;

where D(x,y,z) is the set of coordinates of all the descendants of the node (x,y,z) and
L(x,y,z) = D(x,y,z) - O(x,y,z), with O(x,y,z) being the set of coordinates of the direct
offspring of the node (x,y,z), and two flags describe the state of a single pixel and
are :

- FP3 if the current pixel is significant ;

- FP4 if it is not significant or if its significance is to be analyzed.

6. An encoding method according to claim 5, characterized in that the
exploration of the spatio-temporal tree, implemented in said scanning order, includes,
after an initialization step where the flag FP4 is put to all the coefficients of the lowest
spatio-temporal subband and the flag FS1 to 7 over 8 coefficients of said lowest
spatio-temporal subband, and the maximum significance level MSL is calculated, the
following steps, carried out from the bitplane n = MSL down to the bitplane n = 0 and
from the lowest subband resolution down to the highest one :

(a) a first set of tests related to the set significance :

(1) if the flag FS1 is "on", then output Sn(D(x,y,z)) :
- if Sn(D(x,y,z)) == 1, then :

- for each (x',y',z') in O(x,y,z), put flag FP4 ;

- remove flag FS1 from (x,y,z) ;

- if L(x,y,z) not empty, then put flag FS2 ;

(2) if flag FS2 is "on", then output Sn(L(x,y,z)) :
- if Sn(L(x,y,z)) == 1, then :

- for each (x',y',z') in O(x,y,z), put flag FS1 ;

- remove flag FS2 from (x,y,z) ;

(b) a second set of tests related to the pixel significance :

(1) if the flag FP3 is "on", then output = the n-th bit of (x,y,z) ;

(2) if the flag FP4 is "on", then output Sn(x,y,z) :
- if Sn(x,y,z) == 1, then :

- put flag FP3 "on" ;

- output sign (x,y,z) ;

- and remove flag FP4.

7. An encoding method according to any one of claims 1 to 6, characterized in
that it also comprises a partial decoding step of the bitstream between two resolution
flags, leading to a lower resolution/frame rate reconstructed video sequence.

8. An encoding method according to claim 7, characterized in that the context
used for the encoding of each bit related to the set significance in an arithmetic coding
module is built using the bits of the same bitplane of the last scanned neighbouring
wavelet coefficients in the same spatio-temporal subband, these bits being the bits
output during the first set of tests related to the set significance.

9. An encoding method according to claim 7, characterized in that the context
used for the encoding of each bit related to the pixel significance in an arithmetic
coding module is built using the bits of the same bitplane of the last scanned
neighbouring wavelet coefficients in the same spatio-temporal subband, these bits
being 1 if the neighbouring coefficients are marked by an FP3 flag and 0 if not.